Overview

Dataset statistics

Number of variables: 20
Number of observations: 2057992
Missing cells: 2650580
Missing cells (%): 6.4%
Duplicate rows: 6707
Duplicate rows (%): 0.3%
Total size in memory: 314.0 MiB
Average record size in memory: 160.0 B
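
The percentages and the average record size follow from the raw counts. A minimal Python check, assuming that the missing-cell share is taken over all cells (observations × variables) and that 1 MiB is 2^20 bytes; both are assumptions rather than statements from the report:

```python
# Figures copied from the Dataset statistics above.
n_variables = 20
n_observations = 2_057_992
missing_cells = 2_650_580
duplicate_rows = 6_707
total_size_mib = 314.0

# Assumption: the missing-cell share is computed over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate share relative to the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values agree with the reported 6.4%, 0.3%, and 160.0 B.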

Variable types

Text: 5
Numeric: 6
Categorical: 5
DateTime: 4

Alerts

Dataset has 6707 (0.3%) duplicate rows [Duplicates]
arrival_delay_check is highly overall correlated with departure_delay_check [High correlation]
arrival_delay_m is highly overall correlated with departure_delay_m [High correlation]
departure_delay_check is highly overall correlated with arrival_delay_check [High correlation]
departure_delay_m is highly overall correlated with arrival_delay_m [High correlation]
eva_nr is highly overall correlated with long and 2 other fields [High correlation]
info is highly overall correlated with lat and 3 other fields [High correlation]
lat is highly overall correlated with info and 2 other fields [High correlation]
long is highly overall correlated with eva_nr and 2 other fields [High correlation]
state is highly overall correlated with eva_nr and 4 other fields [High correlation]
zip is highly overall correlated with eva_nr and 3 other fields [High correlation]
arrival_delay_check is highly imbalanced (69.8%) [Imbalance]
departure_delay_check is highly imbalanced (69.7%) [Imbalance]
path has 211069 (10.3%) missing values [Missing]
arrival_plan has 211069 (10.3%) missing values [Missing]
arrival_change has 474922 (23.1%) missing values [Missing]
departure_change has 339378 (16.5%) missing values [Missing]
info has 1414133 (68.7%) missing values [Missing]
arrival_delay_m has 1404435 (68.2%) zeros [Zeros]
departure_delay_m has 1335764 (64.9%) zeros [Zeros]
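
The imbalance percentages can be approximated from the class counts of the two check variables (on_time 1947184 vs. delay 110807 for arrivals, on_time 1946797 vs. delay 111194 for departures). The sketch below assumes the score is one minus the Shannon entropy normalized by log2 of the number of classes. This formula reproduces the reported figures to within rounding, but whether the profiling tool computes exactly this is an assumption, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts as reported for the two *_delay_check variables.
arrival = imbalance_score([1_947_184, 110_807])
departure = imbalance_score([1_946_797, 111_194])

print(f"arrival_delay_check:   {arrival:.3f}")
print(f"departure_delay_check: {departure:.3f}")
```

A score of 0 would mean perfectly balanced classes and a score of 1 a single class, so values near 0.7 reflect the roughly 95/5 split between on_time and delay.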

Reproduction

Analysis started: 2024-12-07 08:47:40.873467
Analysis finished: 2024-12-07 08:49:11.636423
Duration: 1 minute and 30.76 seconds
Software version: ydata-profiling v4.12.0
Download configuration: config.json
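
The reported duration is the difference between the two timestamps. A small sketch using only the standard library, with the timestamps copied from above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-12-07 08:47:40.873467")
finished = datetime.fromisoformat("2024-12-07 08:49:11.636423")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```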

Variables

ID
Text

Distinct: 2026585
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 34
Median length: 33
Mean length: 32.847704
Min length: 28

Characters and Unicode

Total characters: 67600313
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1995178
Unique (%): 96.9%
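
"Distinct" and "Unique" measure different things. The sketch below assumes the usual reading: distinct is the number of different values, and unique is the number of values that occur exactly once. Under that reading, the gap between the two totals is the number of IDs that occur more than once. The helper `distinct_and_unique` and the toy data are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Toy data (hypothetical, not taken from the dataset): "b" repeats.
distinct, unique = distinct_and_unique(["a", "b", "b", "c"])
print(distinct, unique)  # 3 2

# Applied to the reported totals: IDs that occur more than once.
repeated_ids = 2_026_585 - 1_995_178
print(repeated_ids)  # 31407
```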

Sample

1st row: 1573967790757085557-2407072312-14
2nd row: 349781417030375472-2407080017-1
3rd row: 7157250219775883918-2407072120-25
4th row: 349781417030375472-2407080017-2
5th row: 1983158592123451570-2407080010-3

Value                                Count    Frequency (%)
45229876672715008-2407111044-6       2        < 0.1%
8093000697463236928-2407102014-13    2        < 0.1%
2684232165437648261-2407131705-7     2        < 0.1%
1131061763278260844-2407092238-8     2        < 0.1%
4058401902925227636-2407091905-26    2        < 0.1%
3908068791773353441-2407092106-7     2        < 0.1%
2156867993929598961-2407121140-7     2        < 0.1%
5274685741865861980-2407120835-6     2        < 0.1%
4720559371039705295-2407080045-2     2        < 0.1%
7755496932300555138-2407121153-4     2        < 0.1%
Other values (2026575)               2057972  > 99.9%

Most occurring characters

Value  Count    Frequency (%)
1      8482941  12.5%
0      8372506  12.4%
2      7783856  11.5%
4      7196516  10.6%
7      6513910  9.6%
3      5323158  7.9%
-      5149715  7.6%
8      4902097  7.3%
5      4832920  7.1%
6      4523564  6.7%

All 67600313 characters (100.0%) fall under a single (unknown) category, script, and block; the per-category, per-script, and per-block character tables repeat the table above.

line
Text

Distinct: 296
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 5
Median length: 1
Mean length: 1.6071987
Min length: 1

Characters and Unicode

Total characters: 3307602
Distinct characters: 33
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 18
3rd row: 1
4th row: 18
5th row: 33

Value               Count   Frequency (%)
1                   327015  15.9%
2                   170604  8.3%
3                   167018  8.1%
6                   117296  5.7%
5                   102742  5.0%
8                   98742   4.8%
7                   75469   3.7%
4                   71594   3.5%
9                   65811   3.2%
42                  42352   2.1%
Other values (284)  819366  39.8%

Most occurring characters

Value              Count   Frequency (%)
1                  642474  19.4%
2                  400714  12.1%
3                  316028  9.6%
4                  283504  8.6%
5                  272022  8.2%
6                  263607  8.0%
8                  206698  6.2%
7                  195654  5.9%
R                  194509  5.9%
9                  139102  4.2%
Other values (23)  393290  11.9%

All 3307602 characters (100.0%) fall under a single (unknown) category, script, and block; the per-category, per-script, and per-block character tables repeat the table above.

path
Text

Alerts: Missing

Distinct: 22142
Distinct (%): 1.2%
Missing: 211069
Missing (%): 10.3%
Memory size: 15.7 MiB

Length

Max length: 1229
Median length: 626
Mean length: 181.46272
Min length: 4

Characters and Unicode

Total characters: 335147666
Distinct characters: 77
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1121
Unique (%): 0.1%

Sample

1st row: Stolberg(Rheinl)Hbf Gl.44|Eschweiler-St.Jöris|Alsdorf Poststraße|Alsdorf-Mariadorf|Alsdorf-Kellersberg|Alsdorf-Annapark|Alsdorf-Busch|Herzogenrath-August-Schmidt-Platz|Herzogenrath-Alt-Merkstein|Herzogenrath|Kohlscheid|Aachen West|Aachen Schanz
2nd row: Hamm(Westf)Hbf|Kamen|Kamen-Methler|Dortmund-Kurl|Dortmund-Scharnhorst|Dortmund Hbf|Bochum Hbf|Wattenscheid|Essen Hbf|Mülheim(Ruhr)Hbf|Duisburg Hbf|Düsseldorf Flughafen|Düsseldorf Hbf|Düsseldorf-Benrath|Leverkusen Mitte|Köln-Mülheim|Köln Messe/Deutz|Köln Hbf|Köln-Ehrenfeld|Horrem|Düren|Langerwehe|Eschweiler Hbf|Stolberg(Rheinl)Hbf
3rd row: Aachen Hbf
4th row: Herzogenrath|Kohlscheid
5th row: Herzogenrath
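
The sample rows suggest that path holds station names separated by "|". That reading is an assumption based on the samples alone. Under it, a value can be split into its stations as sketched below; `split_path` is a hypothetical helper name.

```python
def split_path(path):
    """Split a pipe-delimited path string into station names.

    Missing values (None or an empty string) yield an empty list.
    """
    if not path:
        return []
    return [part for part in path.split("|") if part]

sample = "Herzogenrath|Kohlscheid"  # 4th sample row above
stations = split_path(sample)
print(stations)       # ['Herzogenrath', 'Kohlscheid']
print(len(stations))  # 2
```

Splitting on "|" rather than on whitespace keeps multi-word names such as "Aachen Hbf" intact, which whitespace tokenization would break apart.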

Value                 Count     Frequency (%)
hbf                   456103    3.4%
s)|berlin             449124    3.4%
allee|berlin          163190    1.2%
berlin                111135    0.8%
friedrichstraße       95783     0.7%
straße|berlin         90152     0.7%
am                    89413     0.7%
flughafen             86299     0.7%
ostkreuz              85078     0.6%
rosenheimer           81764     0.6%
Other values (13355)  11538441  87.1%

Most occurring characters

Value              Count      Frequency (%)
e                  35962727   10.7%
r                  26391616   7.9%
n                  25364913   7.6%
a                  17811571   5.3%
|                  17622361   5.3%
i                  15137595   4.5%
l                  15100658   4.5%
t                  14035768   4.2%
s                  12308835   3.7%
h                  11778931   3.5%
Other values (67)  143632691  42.9%

All 335147666 characters (100.0%) fall under a single (unknown) category, script, and block; the per-category, per-script, and per-block character tables repeat the table above.

eva_nr
Real number (ℝ)

Alerts: High correlation

Distinct: 1996
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8018251.2
Minimum: 8000001
Maximum: 8098360
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 8000001
5-th percentile: 8000105
Q1: 8001582
median: 8004136
Q3: 8010207
95-th percentile: 8089080
Maximum: 8098360
Range: 98359
Interquartile range (IQR): 8625

Descriptive statistics

Standard deviation: 31775.008
Coefficient of variation (CV): 0.0039628352
Kurtosis: 1.0165251
Mean: 8018251.2
Median Absolute Deviation (MAD): 3054
Skewness: 1.7137405
Sum: 1.6501497 × 10^13
Variance: 1.0096511 × 10^9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
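
Several of the statistics above are derived from others. The sketch below recomputes the range, the IQR, the coefficient of variation, and the variance from the reported minimum, maximum, quartiles, mean, and standard deviation. It uses the textbook definitions (CV = standard deviation / mean, variance = standard deviation squared), which appear consistent with the reported figures.

```python
# Values copied from the eva_nr statistics above.
minimum, maximum = 8_000_001, 8_098_360
q1, q3 = 8_001_582, 8_010_207
mean, std = 8_018_251.2, 31_775.008

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range)  # 98359
print(iqr)          # 8625
print(f"{cv:.7f}")
print(f"{variance:.4e}")
```

Because eva_nr is an identifier rather than a measurement, these statistics are mostly useful as a consistency check, not as a description of the stations.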

Common Values

Value                Count    Frequency (%)
8004128              8732     0.4%
8089047              8245     0.4%
8000262              7814     0.4%
8004132              7598     0.4%
8004131              7382     0.4%
8004135              7378     0.4%
8004129              7366     0.4%
8004136              7324     0.4%
8089045              7085     0.3%
8003368              6828     0.3%
Other values (1986)  1982240  96.3%

Minimum 10 values

Value    Count  Frequency (%)
8000001  1488   0.1%
8000002  823    < 0.1%
8000004  848    < 0.1%
8000007  591    < 0.1%
8000009  829    < 0.1%
8000010  946    < 0.1%
8000011  589    < 0.1%
8000012  896    < 0.1%
8000013  2337   0.1%
8000014  756    < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
8098360  531    < 0.1%
8089537  2180   0.1%
8089474  5831   0.3%
8089473  1530   0.1%
8089472  1538   0.1%
8089331  1678   0.1%
8089330  1898   0.1%
8089329  1787   0.1%
8089328  1916   0.1%
8089327  2763   0.1%

category
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2057992
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 4
4th row: 5
5th row: 5

Common Values

Value  Count   Frequency (%)
4      786721  38.2%
5      642557  31.2%
3      420922  20.5%
2      137077  6.7%
1      70715   3.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of the Common Values table above]

Most occurring characters

Every value is a single character, so the character counts are identical to the Common Values table above. All 2057992 characters (100.0%) fall under a single (unknown) category, script, and block.

(variable name not preserved)

Distinct: 1996
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 42
Median length: 30
Mean length: 14.651397
Min length: 4

Characters and Unicode

Total characters: 30152457
Distinct characters: 63
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aachen Hbf
2nd row: Aachen Hbf
3rd row: Aachen-Rothe Erde
4th row: Aachen West
5th row: Aachen West

Value                Count    Frequency (%)
hbf                  186959   5.9%
münchen              63047    2.0%
main                 62404    2.0%
frankfurt            54552    1.7%
straße               39315    1.2%
berlin               34038    1.1%
stuttgart            27209    0.9%
bad                  27077    0.8%
köln                 25766    0.8%
ost                  25294    0.8%
Other values (2079)  2639997  82.9%

Most occurring characters

Value              Count     Frequency (%)
e                  3523672   11.7%
r                  2329597   7.7%
n                  2327081   7.7%
a                  1803803   6.0%
t                  1456625   4.8%
i                  1322166   4.4%
l                  1289613   4.3%
s                  1286458   4.3%
h                  1199314   4.0%
(blank)            1127666   3.7%
Other values (53)  12486462  41.4%

All 30152457 characters (100.0%) fall under a single (unknown) category, script, and block; the per-category, per-script, and per-block character tables repeat the table above.

state
Categorical

Alerts: High correlation

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.7 MiB

Length

Max length: 22
Median length: 19
Mean length: 10.957535
Min length: 6

Characters and Unicode

Total characters: 22550520
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Nordrhein-Westfalen
2nd row: Nordrhein-Westfalen
3rd row: Nordrhein-Westfalen
4th row: Nordrhein-Westfalen
5th row: Nordrhein-Westfalen

Common Values

Value                Count   Frequency (%)
Nordrhein-Westfalen  342558  16.6%
Berlin               334037  16.2%
Bayern               329968  16.0%
Baden-Württemberg    252714  12.3%
Hessen               200022  9.7%
Hamburg              154711  7.5%
Sachsen              84676   4.1%
Niedersachsen        82602   4.0%
Rheinland-Pfalz      78824   3.8%
Brandenburg          58863   2.9%
Other values (7)     139017  6.8%

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value              Count    Frequency (%)
e                  3534415  15.7%
n                  2458846  10.9%
r                  2327711  10.3%
a                  1577652  7.0%
s                  1095569  4.9%
B                  986010   4.4%
l                  977680   4.3%
i                  931200   4.1%
t                  915064   4.1%
d                  832820   3.7%
Other values (25)  6913553  30.7%

All 22550520 characters (100.0%) fall under a single (unknown) category, script, and block; the per-category, per-script, and per-block character tables repeat the table above.

city
Text

Distinct: 1292
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 15.7 MiB

Length

Max length: 25
Median length: 23
Mean length: 8.9922755
Min length: 3

Characters and Unicode

Total characters: 18506022
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aachen
2nd row: Aachen
3rd row: Aachen
4th row: Aachen
5th row: Aachen

Value                Count    Frequency (%)
berlin               335199   13.8%
hamburg              154711   6.4%
münchen              118039   4.9%
main                 87651    3.6%
am                   82299    3.4%
frankfurt            69186    2.9%
köln                 42863    1.8%
stuttgart            41450    1.7%
düsseldorf           38327    1.6%
bad                  28394    1.2%
Other values (1345)  1424922  58.8%

Most occurring characters

Value              Count    Frequency (%)
e                  2101384  11.4%
n                  1842040  10.0%
r                  1620113  8.8%
a                  1150130  6.2%
i                  1086512  5.9%
l                  883160   4.8%
t                  711947   3.8%
u                  687371   3.7%
h                  638318   3.4%
g                  635350   3.4%
Other values (50)  7149697  38.6%

All 18506022 characters (100.0%) fall under a single (unknown) category, script, and block; the per-category, per-script, and per-block character tables repeat the table above.

zip
Real number (ℝ)

Alerts: High correlation

Distinct: 1651
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 46284.135
Minimum: 1067
Maximum: 99974
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 1067
5-th percentile: 7745
Q1: 18119
median: 47051
Q3: 70806
95-th percentile: 88427
Maximum: 99974
Range: 98907
Interquartile range (IQR): 52687

Descriptive statistics

Standard deviation: 28213.241
Coefficient of variation (CV): 0.60956614
Kurtosis: -1.3679508
Mean: 46284.135
Median Absolute Deviation (MAD): 26198
Skewness: 0.045418452
Sum: 9.5252333 × 10^10
Variance: 7.9598699 × 10^8
Monotonicity: Not monotonic
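
Because zip is profiled as a number, its minimum appears as 1067. German postal codes have five digits (for example 01067), so a leading zero is lost whenever the column is stored numerically. The sketch below assumes the column holds German postal codes, which the state and city values suggest but the report does not state; `format_zip` is a hypothetical helper name.

```python
def format_zip(value):
    """Render a numeric postal code as a five-digit string.

    None and NaN (the one missing value) are returned as None.
    """
    if value is None or value != value:  # NaN is the only value unequal to itself
        return None
    return f"{int(value):05d}"

print(format_zip(1067))     # 01067
print(format_zip(80331.0))  # 80331
```

Treating the column as text also makes statistics such as mean, skewness, and sum inapplicable, which is arguably appropriate for an identifier.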

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value                Count    Frequency (%)
80331                22358    1.1%
80639                13196    0.6%
10557                12961    0.6%
14057                11885    0.6%
10827                11676    0.6%
10117                11643    0.6%
60313                11368    0.6%
22525                9877     0.5%
10317                9631     0.5%
20354                9614     0.5%
Other values (1641)  1933782  94.0%

Minimum 10 values

Value  Count  Frequency (%)
1067   2458   0.1%
1069   2045   0.1%
1097   3301   0.2%
1109   1799   0.1%
1127   597    < 0.1%
1129   1882   0.1%
1159   982    < 0.1%
1187   566    < 0.1%
1219   917    < 0.1%
1237   1944   0.1%

Maximum 10 values

Value  Count  Frequency (%)
99974  421    < 0.1%
99947  453    < 0.1%
99880  424    < 0.1%
99867  494    < 0.1%
99817  453    < 0.1%
99752  252    < 0.1%
99734  496    < 0.1%
99610  353    < 0.1%
99518  279    < 0.1%
99510  360    < 0.1%

long
Real number (ℝ)

Alerts: High correlation

Distinct: 1995
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.183157
Minimum: 6.070715
Maximum: 14.97908
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 6.070715
5-th percentile: 6.815137
Q1: 8.494709
median: 9.944088
Q3: 12.090548
95-th percentile: 13.513799
Maximum: 14.97908
Range: 8.908365
Interquartile range (IQR): 3.595839

Descriptive statistics

Standard deviation: 2.2735233
Coefficient of variation (CV): 0.2232631
Kurtosis: -1.2260254
Mean: 10.183157
Median Absolute Deviation (MAD): 1.694189
Skewness: 0.11328713
Sum: 20956846
Variance: 5.1689083
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value                Count    Frequency (%)
11.536537            8732     0.4%
13.283966            8245     0.4%
11.604971            7814     0.4%
11.565619            7598     0.4%
11.583234            7382     0.4%
11.575386            7378     0.4%
11.548572            7366     0.4%
11.593049            7324     0.4%
13.451646            7085     0.3%
6.975001             6828     0.3%
Other values (1985)  1982239  96.3%

Minimum 10 values

Value     Count  Frequency (%)
6.070715  1744   0.1%
6.07384   1206   0.1%
6.074485  1051   0.1%
6.091499  1488   0.1%
6.094486  1899   0.1%
6.097265  815    < 0.1%
6.116475  949    < 0.1%
6.124518  818    < 0.1%
6.203225  252    < 0.1%
6.207467  717    < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
14.97908   608    < 0.1%
14.902088  272    < 0.1%
14.805774  577    < 0.1%
14.706775  348    < 0.1%
14.671941  461    < 0.1%
14.658435  480    < 0.1%
14.648866  264    < 0.1%
14.638027  266    < 0.1%
14.578802  280    < 0.1%
14.546496  716    < 0.1%

lat
Real number (ℝ)

Alerts: High correlation

Distinct: 1996
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.882081
Minimum: 47.411032
Maximum: 54.906839
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 47.411032
5-th percentile: 48.111413
Q1: 49.353291
median: 51.087456
Q3: 52.478542
95-th percentile: 53.564711
Maximum: 54.906839
Range: 7.495807
Interquartile range (IQR): 3.125251

Descriptive statistics

Standard deviation: 1.7921961
Coefficient of variation (CV): 0.035222539
Kurtosis: -1.1316864
Mean: 50.882081
Median Absolute Deviation (MAD): 1.42371
Skewness: -0.1180777
Sum: 1.0471487 × 10^8
Variance: 3.2119669
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value                Count    Frequency (%)
48.142623            8732     0.4%
52.500737            8245     0.4%
48.12744             7814     0.4%
48.139452            7598     0.4%
48.134202            7382     0.4%
48.137048            7378     0.4%
48.141969            7366     0.4%
48.129168            7324     0.4%
52.505976            7085     0.3%
50.940874            6828     0.3%
Other values (1986)  1982239  96.3%

Minimum 10 values

Value      Count  Frequency (%)
47.411032  220    < 0.1%
47.44003   237    < 0.1%
47.456591  449    < 0.1%
47.491452  419    < 0.1%
47.513241  472    < 0.1%
47.544341  565    < 0.1%
47.5509    368    < 0.1%
47.552384  874    < 0.1%
47.555857  484    < 0.1%
47.556923  608    < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
54.906839   233    < 0.1%
54.888814   364    < 0.1%
54.872142   371    < 0.1%
54.861997   381    < 0.1%
54.789605   563    < 0.1%
54.774039   281    < 0.1%
54.685934   373    < 0.1%
54.621166   311    < 0.1%
54.499457   515    < 0.1%
54.4720826  537    < 0.1%

arrival_plan
Date

Alerts: Missing

Distinct: 10084
Distinct (%): 0.5%
Missing: 211069
Missing (%): 10.3%
Memory size: 15.7 MiB
Minimum: 2024-07-07 23:37:00
Maximum: 2024-07-14 23:59:00

[Figure: Histogram with fixed size bins (bins=50)]

(variable name not preserved)

Distinct: 10089
Distinct (%): 0.5%
Missing: 1
Missing (%): < 0.1%
Memory size: 15.7 MiB
Minimum: 2024-07-08 00:00:00
Maximum: 2024-07-15 00:10:00

[Figure: Histogram with fixed size bins (bins=50)]

arrival_change
Date

Alerts: Missing

Distinct: 10114
Distinct (%): 0.6%
Missing: 474922
Missing (%): 23.1%
Memory size: 15.7 MiB
Minimum: 2024-07-07 23:39:00
Maximum: 2024-07-15 01:03:00

[Figure: Histogram with fixed size bins (bins=50)]

departure_change
Date

Alerts: Missing

Distinct: 10108
Distinct (%): 0.6%
Missing: 339378
Missing (%): 16.5%
Memory size: 15.7 MiB
Minimum: 2024-07-08 00:00:00
Maximum: 2024-07-15 01:04:00

[Figure: Histogram with fixed size bins (bins=50)]

arrival_delay_m
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 116
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1767087
Minimum: 0
Maximum: 159
Zeros: 1404435
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 6
Maximum: 159
Range: 159
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.4052415
Coefficient of variation (CV): 2.8938695
Kurtosis: 106.92896
Mean: 1.1767087
Median Absolute Deviation (MAD): 0
Skewness: 7.667469
Sum: 2421656
Variance: 11.59567
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value               Count    Frequency (%)
0                   1404435  68.2%
1                   254614   12.4%
2                   130275   6.3%
3                   80385    3.9%
4                   46057    2.2%
5                   31418    1.5%
6                   21921    1.1%
7                   15746    0.8%
8                   12240    0.6%
9                   9843     0.5%
Other values (106)  51057    2.5%

Minimum 10 values

Identical to the first ten rows of the Common Values table above (values 0 through 9).

Maximum 10 values

Value  Count  Frequency (%)
159    1      < 0.1%
157    2      < 0.1%
140    1      < 0.1%
136    1      < 0.1%
134    1      < 0.1%
133    3      < 0.1%
132    1      < 0.1%
120    1      < 0.1%
117    1      < 0.1%
116    3      < 0.1%

departure_delay_m
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 121
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2236404
Minimum: 0
Maximum: 159
Zeros: 1335764
Zeros (%): 64.9%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 6
Maximum: 159
Range: 159
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.4155783
Coefficient of variation (CV): 2.7913251
Kurtosis: 107.07563
Mean: 1.2236404
Median Absolute Deviation (MAD): 0
Skewness: 7.6630691
Sum: 2518241
Variance: 11.666175
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value               Count    Frequency (%)
0                   1335764  64.9%
1                   305641   14.9%
2                   146180   7.1%
3                   81436    4.0%
4                   46490    2.3%
5                   31286    1.5%
6                   21722    1.1%
7                   15693    0.8%
8                   12271    0.6%
9                   9764     0.5%
Other values (111)  51744    2.5%

Minimum 10 values

Identical to the first ten rows of the Common Values table above (values 0 through 9).

Maximum 10 values

Value  Count  Frequency (%)
159    1      < 0.1%
157    1      < 0.1%
156    1      < 0.1%
137    1      < 0.1%
135    1      < 0.1%
134    2      < 0.1%
133    1      < 0.1%
132    2      < 0.1%
131    1      < 0.1%
120    1      < 0.1%

info
Categorical

Alerts: High correlation, Missing

Distinct: 7
Distinct (%): < 0.1%
Missing: 1414133
Missing (%): 68.7%
Memory size: 15.7 MiB

Length

Max length: 34
Median length: 11
Mean length: 16.535776
Min length: 7

Characters and Unicode

Total characters: 10646708
Distinct characters: 28
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Bauarbeiten. (Quelle: zuginfo.nrw)
2nd row: Information
3rd row: Information
4th row: Information
5th row: Information

Common Values

Value                               Count    Frequency (%)
Information                         243525   11.8%
Störung                             115698   5.6%
Bauarbeiten                         96154    4.7%
Information. (Quelle: zuginfo.nrw)  78925    3.8%
Bauarbeiten. (Quelle: zuginfo.nrw)  72472    3.5%
Störung. (Quelle: zuginfo.nrw)      28680    1.4%
Großstörung                         8405     0.4%
(Missing)                           1414133  68.7%
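
Three of the seven info labels differ from another label only by the suffix ". (Quelle: zuginfo.nrw)". A minimal sketch of merging them, assuming the suffix marks only the source of the message and not a separate category (an interpretation, not something the report states); `normalize_info` is a hypothetical helper name.

```python
SOURCE_SUFFIX = ". (Quelle: zuginfo.nrw)"

def normalize_info(label):
    """Strip the source suffix so both spellings map to one category."""
    if label.endswith(SOURCE_SUFFIX):
        return label[: -len(SOURCE_SUFFIX)]
    return label

# Counts copied from the Common Values table above (missing values excluded).
counts = {
    "Information": 243_525,
    "Störung": 115_698,
    "Bauarbeiten": 96_154,
    "Information. (Quelle: zuginfo.nrw)": 78_925,
    "Bauarbeiten. (Quelle: zuginfo.nrw)": 72_472,
    "Störung. (Quelle: zuginfo.nrw)": 28_680,
    "Großstörung": 8_405,
}

merged = {}
for label, n in counts.items():
    key = normalize_info(label)
    merged[key] = merged.get(key, 0) + n

print(merged)
```

After merging, four categories remain: Information (322450), Bauarbeiten (168626), Störung (144378), and Großstörung (8405).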

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar plot of the Common Values table above]

Value        Count   Frequency (%)
information  322450  32.1%
quelle       180077  17.9%
zuginfo.nrw  180077  17.9%
bauarbeiten  168626  16.8%
störung      144378  14.4%
großstörung  8405    0.8%

Most occurring characters

Value              Count    Frequency (%)
n                  1326463  12.5%
o                  833382   7.8%
r                  832341   7.8%
e                  697406   6.6%
u                  681563   6.4%
i                  671153   6.3%
a                  659702   6.2%
t                  643859   6.0%
f                  502527   4.7%
l                  360154   3.4%
Other values (18)  3438158  32.3%

All 10646708 characters (100.0%) fall under a single (unknown) category, script, and block; the per-category, per-script, and per-block character tables repeat the table above.

arrival_delay_check
Categorical

High correlation  Imbalance 

Distinct2
Distinct (%)< 0.1%
Missing1
Missing (%)< 0.1%
Memory size15.7 MiB
on_time
1947184 
delay
 
110807

Length

Max length7
Median length7
Mean length6.8923154
Min length5

Characters and Unicode

Total characters: 14184323
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
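
As an illustration of such character properties, the sketch below looks up the General Category and name of each distinct character in the two values of this variable using Python's standard `unicodedata` module. This is not necessarily how the report derives its figures (the report lists the category as "(unknown)"), and the standard library does not expose script or block properties.

```python
import unicodedata

# Distinct characters that occur in the two values of this variable.
chars = sorted(set("on_time" + "delay"))

for ch in chars:
    # category() returns the two-letter General Category code,
    # for example 'Ll' (lowercase letter) or 'Pc' (connector punctuation).
    print(repr(ch), unicodedata.category(ch), unicodedata.name(ch))

print(len(chars))  # 11, the "Distinct characters" figure in the report
```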

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: on_time
2nd row: on_time
3rd row: on_time
4th row: on_time
5th row: on_time

Common Values

Value       Count     Frequency (%)
on_time     1947184   94.6%
delay       110807    5.4%
(Missing)   1         < 0.1%
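
The Imbalance alert for this variable quotes 69.8%. A minimal sketch of one way to arrive at that figure, assuming the score is one minus the Shannon entropy of the class distribution normalised by its maximum (log2 of the number of classes); this formula is an assumption that happens to reproduce the reported value, and `imbalance_score` is an illustrative name.

```python
import math

# Non-missing class counts for arrival_delay_check, from the table above.
counts = {"on_time": 1947184, "delay": 110807}


def imbalance_score(class_counts):
    """Return 1 - H / log2(k), with H the Shannon entropy in bits.

    0 means perfectly balanced classes, 1 means a single class dominates.
    """
    k = len(class_counts)
    if k < 2:
        return 0.0
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)


score = imbalance_score(list(counts.values()))
print(f"{score:.4f}")  # close to 0.698
```

Under this definition a 50/50 split scores 0, so a score near 0.70 reflects that roughly 19 out of 20 rows are on_time.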

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count     Frequency (%)
on_time   1947184   94.6%
delay     110807    5.4%

Most occurring characters

Value   Count     Frequency (%)
e       2057991   14.5%
o       1947184   13.7%
n       1947184   13.7%
_       1947184   13.7%
t       1947184   13.7%
i       1947184   13.7%
m       1947184   13.7%
d       110807    0.8%
l       110807    0.8%
a       110807    0.8%
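
These character counts follow directly from the value counts: every character of a value occurs once per row holding that value. A small sketch under that reading (names are illustrative):

```python
from collections import Counter

# Non-missing value counts for arrival_delay_check, from this report.
value_counts = {"on_time": 1947184, "delay": 110807}

# Each character in a value is counted once for every row with that value.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total = sum(char_counts.values())
for ch, n in char_counts.most_common():
    print(ch, n, f"{n / total:.1%}")
```

The letter e is the only character shared by both values, which is why its count equals the number of non-missing rows (2057991).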

Most occurring categories

Value       Count      Frequency (%)
(unknown)   14184323   100.0%

Most occurring scripts

Value       Count      Frequency (%)
(unknown)   14184323   100.0%

Most occurring blocks

Value       Count      Frequency (%)
(unknown)   14184323   100.0%

departure_delay_check
Categorical

High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 15.7 MiB

on_time: 1946797
delay: 111194

Length

Max length: 7
Median length: 7
Mean length: 6.8919393
Min length: 5

Characters and Unicode

Total characters: 14183549
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: on_time
2nd row: on_time
3rd row: on_time
4th row: on_time
5th row: on_time

Common Values

Value       Count     Frequency (%)
on_time     1946797   94.6%
delay       111194    5.4%
(Missing)   1         < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count     Frequency (%)
on_time   1946797   94.6%
delay     111194    5.4%

Most occurring characters

Value   Count     Frequency (%)
e       2057991   14.5%
o       1946797   13.7%
n       1946797   13.7%
_       1946797   13.7%
t       1946797   13.7%
i       1946797   13.7%
m       1946797   13.7%
d       111194    0.8%
l       111194    0.8%
a       111194    0.8%

Most occurring categories

Value       Count      Frequency (%)
(unknown)   14183549   100.0%

Most occurring scripts

Value       Count      Frequency (%)
(unknown)   14183549   100.0%

Most occurring blocks

Value       Count      Frequency (%)
(unknown)   14183549   100.0%

Interactions

[36 interaction plots, rendered with Matplotlib v3.7.2; images not reproduced here]

Correlations

variable; arrival_delay_check; arrival_delay_m; category; departure_delay_check; departure_delay_m; eva_nr; info; lat; long; state; zip
arrival_delay_check; 1.000; 0.432; 0.031; 0.910; 0.414; 0.086; 0.086; 0.096; 0.102; 0.114; 0.107
arrival_delay_m; 0.432; 1.000; 0.010; 0.428; 0.824; -0.086; 0.027; -0.251; -0.105; 0.019; 0.226
category; 0.031; 0.010; 1.000; 0.032; 0.011; 0.163; 0.127; 0.176; 0.175; 0.209; 0.158
departure_delay_check; 0.910; 0.428; 0.032; 1.000; 0.434; 0.087; 0.085; 0.096; 0.102; 0.114; 0.107
departure_delay_m; 0.414; 0.824; 0.011; 0.434; 1.000; -0.094; 0.027; -0.270; -0.111; 0.019; 0.245
eva_nr; 0.086; -0.086; 0.163; 0.087; -0.094; 1.000; 0.313; 0.348; 0.654; 0.706; -0.530
info; 0.086; 0.027; 0.127; 0.085; 0.027; 0.313; 1.000; 0.515; 0.563; 0.577; 0.540
lat; 0.096; -0.251; 0.176; 0.096; -0.270; 0.348; 0.515; 1.000; 0.258; 0.688; -0.833
long; 0.102; -0.105; 0.175; 0.102; -0.111; 0.654; 0.563; 0.258; 1.000; 0.650; -0.410
state; 0.114; 0.019; 0.209; 0.114; 0.019; 0.706; 0.577; 0.688; 0.650; 1.000; 0.696
zip; 0.107; 0.226; 0.158; 0.107; 0.245; -0.530; 0.540; -0.833; -0.410; 0.696; 1.000
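
The "High correlation" alerts in the Overview can be related back to this matrix. The sketch below lists, for each variable, the other variables whose absolute coefficient is at least 0.5. The 0.5 cut-off is an assumption; it is chosen because it reproduces the partner counts in the alerts (for example, state with eva_nr and 4 other fields).

```python
names = ["arrival_delay_check", "arrival_delay_m", "category",
         "departure_delay_check", "departure_delay_m", "eva_nr",
         "info", "lat", "long", "state", "zip"]

# Correlation matrix copied from the report, rows in the same order as names.
matrix = [
    [1.000, 0.432, 0.031, 0.910, 0.414, 0.086, 0.086, 0.096, 0.102, 0.114, 0.107],
    [0.432, 1.000, 0.010, 0.428, 0.824, -0.086, 0.027, -0.251, -0.105, 0.019, 0.226],
    [0.031, 0.010, 1.000, 0.032, 0.011, 0.163, 0.127, 0.176, 0.175, 0.209, 0.158],
    [0.910, 0.428, 0.032, 1.000, 0.434, 0.087, 0.085, 0.096, 0.102, 0.114, 0.107],
    [0.414, 0.824, 0.011, 0.434, 1.000, -0.094, 0.027, -0.270, -0.111, 0.019, 0.245],
    [0.086, -0.086, 0.163, 0.087, -0.094, 1.000, 0.313, 0.348, 0.654, 0.706, -0.530],
    [0.086, 0.027, 0.127, 0.085, 0.027, 0.313, 1.000, 0.515, 0.563, 0.577, 0.540],
    [0.096, -0.251, 0.176, 0.096, -0.270, 0.348, 0.515, 1.000, 0.258, 0.688, -0.833],
    [0.102, -0.105, 0.175, 0.102, -0.111, 0.654, 0.563, 0.258, 1.000, 0.650, -0.410],
    [0.114, 0.019, 0.209, 0.114, 0.019, 0.706, 0.577, 0.688, 0.650, 1.000, 0.696],
    [0.107, 0.226, 0.158, 0.107, 0.245, -0.530, 0.540, -0.833, -0.410, 0.696, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in this section of the report

# For every variable, collect the other variables with |coefficient| >= THRESHOLD.
flagged = {
    row: [col for col, value in zip(names, values)
          if col != row and abs(value) >= THRESHOLD]
    for row, values in zip(names, matrix)
}

for name, partners in flagged.items():
    if partners:
        print(f"{name}: {', '.join(partners)}")
```

Note that category has no coefficient above the cut-off, which is consistent with it not appearing among the alerts.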

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
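
To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between missing-value indicators (1 = missing, 0 = present) for three columns, using only the first ten rows shown in the Sample section. It is an illustration on ten rows, not a reproduction of the heatmap, and `pearson` is a helper written for this example.

```python
import math

# Missing-value indicators for sample rows 0-9 (1 = NaN in the sample table).
missing = {
    "path":           [0, 1, 0, 0, 0, 0, 0, 0, 1, 0],
    "arrival_plan":   [0, 1, 0, 0, 0, 0, 0, 0, 1, 0],
    "arrival_change": [0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


r_same = pearson(missing["path"], missing["arrival_plan"])
r_partial = pearson(missing["path"], missing["arrival_change"])
print(round(r_same, 3), round(r_partial, 3))
```

In these ten rows path and arrival_plan are missing together, so their nullity correlation is 1; arrival_change is also missing in rows where path is present, which lowers the value. The Overview reports the same number of missing values (211069) for path and arrival_plan, which is consistent with, though not proof of, the same pattern in the full dataset.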

Sample

(index); ID; line; path; eva_nr; category; station; state; city; zip; long; lat; arrival_plan; departure_plan; arrival_change; departure_change; arrival_delay_m; departure_delay_m; info; arrival_delay_check; departure_delay_check
0; 1573967790757085557-2407072312-14; 20; Stolberg(Rheinl)Hbf Gl.44|Eschweiler-St.Jöris|Alsdorf Poststraße|Alsdorf-Mariadorf|Alsdorf-Kellersberg|Alsdorf-Annapark|Alsdorf-Busch|Herzogenrath-August-Schmidt-Platz|Herzogenrath-Alt-Merkstein|Herzogenrath|Kohlscheid|Aachen West|Aachen Schanz; 8000001; 2; Aachen Hbf; Nordrhein-Westfalen; Aachen; 52064.0; 6.091499; 50.767800; 2024-07-08 00:00:00; 2024-07-08 00:01:00; 2024-07-08 00:03:00; 2024-07-08 00:04:00; 3.0; 3.0; NaN; on_time; on_time
1; 349781417030375472-2407080017-1; 18; NaN; 8000001; 2; Aachen Hbf; Nordrhein-Westfalen; Aachen; 52064.0; 6.091499; 50.767800; NaN; 2024-07-08 00:17:00; NaN; NaN; 0.0; 0.0; NaN; on_time; on_time
2; 7157250219775883918-2407072120-25; 1; Hamm(Westf)Hbf|Kamen|Kamen-Methler|Dortmund-Kurl|Dortmund-Scharnhorst|Dortmund Hbf|Bochum Hbf|Wattenscheid|Essen Hbf|Mülheim(Ruhr)Hbf|Duisburg Hbf|Düsseldorf Flughafen|Düsseldorf Hbf|Düsseldorf-Benrath|Leverkusen Mitte|Köln-Mülheim|Köln Messe/Deutz|Köln Hbf|Köln-Ehrenfeld|Horrem|Düren|Langerwehe|Eschweiler Hbf|Stolberg(Rheinl)Hbf; 8000406; 4; Aachen-Rothe Erde; Nordrhein-Westfalen; Aachen; 52066.0; 6.116475; 50.770202; 2024-07-08 00:03:00; 2024-07-08 00:04:00; 2024-07-08 00:03:00; 2024-07-08 00:04:00; 0.0; 0.0; NaN; on_time; on_time
3; 349781417030375472-2407080017-2; 18; Aachen Hbf; 8000404; 5; Aachen West; Nordrhein-Westfalen; Aachen; 52072.0; 6.070715; 50.780360; 2024-07-08 00:20:00; 2024-07-08 00:21:00; NaN; NaN; 0.0; 0.0; NaN; on_time; on_time
4; 1983158592123451570-2407080010-3; 33; Herzogenrath|Kohlscheid; 8000404; 5; Aachen West; Nordrhein-Westfalen; Aachen; 52072.0; 6.070715; 50.780360; 2024-07-08 00:20:00; 2024-07-08 00:21:00; 2024-07-08 00:20:00; 2024-07-08 00:21:00; 0.0; 0.0; NaN; on_time; on_time
5; -5293934437045765939-2407080023-2; 4; Herzogenrath; 8000404; 5; Aachen West; Nordrhein-Westfalen; Aachen; 52072.0; 6.070715; 50.780360; 2024-07-08 00:30:00; 2024-07-08 00:31:00; 2024-07-08 00:30:00; 2024-07-08 00:31:00; 0.0; 0.0; Bauarbeiten. (Quelle: zuginfo.nrw); on_time; on_time
6; 6845762881043426854-2407072357-6; RB33; Lindern|Geilenkirchen|Übach-Palenberg|Herzogenrath|Kohlscheid; 8000404; 5; Aachen West; Nordrhein-Westfalen; Aachen; 52072.0; 6.070715; 50.780360; 2024-07-08 00:58:00; 2024-07-08 00:58:00; NaN; NaN; 0.0; 0.0; NaN; on_time; on_time
7; -2100556839975301087-2407072307-13; 18; Liège-Guillemins|Bressoux|Vise|Eijsden|Maastricht Randwyck|Maastricht|Meerssen|Valkenburg(NL)|Heerlen|Landgraaf|Eygelshoven Markt|Herzogenrath; 8000404; 5; Aachen West; Nordrhein-Westfalen; Aachen; 52072.0; 6.070715; 50.780360; 2024-07-08 00:37:00; 2024-07-08 00:41:00; 2024-07-08 00:37:00; 2024-07-08 00:41:00; 0.0; 0.0; NaN; on_time; on_time
8; -7696913984968518161-2407080037-1; 13; NaN; 8000002; 3; Aalen Hbf; Baden-Württemberg; Aalen; 73430.0; 10.096271; 48.841013; NaN; 2024-07-08 00:37:00; NaN; 2024-07-08 00:37:00; 0.0; 0.0; Information; on_time; on_time
9; -6027587483204218492-2407080013-4; 8; Bremen Hbf|Bremen-Sebaldsbrück|Bremen-Mahndorf; 8000413; 4; Achim; Niedersachsen; Achim; 28832.0; 9.030447; 53.015990; 2024-07-08 00:27:00; 2024-07-08 00:27:00; 2024-07-08 01:16:00; 2024-07-08 01:17:00; 49.0; 50.0; NaN; delay; delay

(index); ID; line; path; eva_nr; category; station; state; city; zip; long; lat; arrival_plan; departure_plan; arrival_change; departure_change; arrival_delay_m; departure_delay_m; info; arrival_delay_check; departure_delay_check
2057982; -7544299009042287777-2407142237-9; 1; Villingen(Schwarzw)|Donaueschingen|Hüfingen Mitte|Döggingen|Unadingen|Bachheim|Löffingen|Rötenbach(Baden); 8004331; 5; Neustadt (Schwarzw); Baden-Württemberg; Neustadt; 79822.0; 8.210883; 47.910198; 2024-07-14 23:27:00; 2024-07-14 23:28:00; 2024-07-14 23:27:00; 2024-07-14 23:28:00; 0.0; 0.0; Bauarbeiten; on_time; on_time
2057983; 6612581188908164643-2407142033-17; 1; Trier Hbf|Konz|Saarburg(Bz Trier)|Mettlach|Merzig(Saar)|Dillingen(Saar)|Saarlouis Hbf|Völklingen|Saarbrücken Hbf|St Ingbert|Homburg(Saar)Hbf|Landstuhl|Kaiserslautern Hbf|Hochspeyer|Lambrecht(Pfalz)|Neustadt(Weinstr)Hbf; 8004489; 4; Neustadt (Weinstr) Böbig; Rheinland-Pfalz; Neustadt; 67433.0; 8.158313; 49.354245; 2024-07-14 23:01:00; 2024-07-14 23:02:00; 2024-07-14 23:02:00; 2024-07-14 23:02:00; 1.0; 0.0; NaN; on_time; on_time
2057984; 4875328236568005990-2407142300-8; 2; Kaiserslautern Hbf|Hochspeyer|Frankenstein(Pfalz)|Weidenthal|Neidenfels|Lambrecht(Pfalz)|Neustadt(Weinstr)Hbf; 8004489; 4; Neustadt (Weinstr) Böbig; Rheinland-Pfalz; Neustadt; 67433.0; 8.158313; 49.354245; 2024-07-14 23:33:00; 2024-07-14 23:34:00; 2024-07-14 23:34:00; 2024-07-14 23:35:00; 1.0; 1.0; NaN; on_time; on_time
2057985; 788723344751923009-2407142304-4; 2; Schifferstadt|Böhl-Iggelheim|Haßloch(Pfalz); 8004489; 4; Neustadt (Weinstr) Böbig; Rheinland-Pfalz; Neustadt; 67433.0; 8.158313; 49.354245; 2024-07-14 23:18:00; 2024-07-14 23:18:00; 2024-07-14 23:18:00; 2024-07-14 23:19:00; 0.0; 1.0; NaN; on_time; on_time
2057986; 6612581188908164643-2407142033-16; 1; Trier Hbf|Konz|Saarburg(Bz Trier)|Mettlach|Merzig(Saar)|Dillingen(Saar)|Saarlouis Hbf|Völklingen|Saarbrücken Hbf|St Ingbert|Homburg(Saar)Hbf|Landstuhl|Kaiserslautern Hbf|Hochspeyer|Lambrecht(Pfalz); 8000275; 2; Neustadt (Weinstr) Hbf; Rheinland-Pfalz; Neustadt; 67434.0; 8.140757; 49.349553; 2024-07-14 22:56:00; 2024-07-14 23:00:00; 2024-07-14 22:56:00; 2024-07-14 23:00:00; 0.0; 0.0; NaN; on_time; on_time
2057987; 4875328236568005990-2407142300-7; 2; Kaiserslautern Hbf|Hochspeyer|Frankenstein(Pfalz)|Weidenthal|Neidenfels|Lambrecht(Pfalz); 8000275; 2; Neustadt (Weinstr) Hbf; Rheinland-Pfalz; Neustadt; 67434.0; 8.140757; 49.349553; 2024-07-14 23:30:00; 2024-07-14 23:32:00; 2024-07-14 23:30:00; 2024-07-14 23:33:00; 0.0; 1.0; NaN; on_time; on_time
2057988; 2971209219135860640-2407142336-1; 6; NaN; 8000275; 2; Neustadt (Weinstr) Hbf; Rheinland-Pfalz; Neustadt; 67434.0; 8.140757; 49.349553; NaN; 2024-07-14 23:36:00; NaN; 2024-07-14 23:36:00; 0.0; 0.0; NaN; on_time; on_time
2057989; 788723344751923009-2407142304-5; 2; Schifferstadt|Böhl-Iggelheim|Haßloch(Pfalz)|Neustadt-Böbig; 8000275; 2; Neustadt (Weinstr) Hbf; Rheinland-Pfalz; Neustadt; 67434.0; 8.140757; 49.349553; 2024-07-14 23:21:00; 2024-07-14 23:30:00; 2024-07-14 23:21:00; 2024-07-14 23:30:00; 0.0; 0.0; NaN; on_time; on_time
2057990; 8280296046192255306-2407142239-3; 1; Mannheim Hbf|Ludwigshafen(Rhein) Mitte; 8000275; 2; Neustadt (Weinstr) Hbf; Rheinland-Pfalz; Neustadt; 67434.0; 8.140757; 49.349553; 2024-07-14 23:00:00; 2024-07-14 23:02:00; 2024-07-14 23:05:00; 2024-07-14 23:06:00; 5.0; 4.0; NaN; on_time; on_time
2057991; -2936232225014219596-2407142332-1; S2; NaN; 8004322; 4; Neustadt a Rübenberge; Nieder; NaN; NaN; NaN; NaN; NaN; NaN; NaN; NaN; NaN; NaN; NaN; NaN; NaN

Duplicate rows

Most frequently occurring

(index); ID; line; path; eva_nr; category; station; state; city; zip; long; lat; arrival_plan; departure_plan; arrival_change; departure_change; arrival_delay_m; departure_delay_m; info; arrival_delay_check; departure_delay_check; # duplicates
0; -1003145420136048192-2407100551-6; 51; Gera Hbf|Hermsdorf-Klosterlausnitz|Stadtroda|Jena-Göschwitz|Jena West; 8010366; 2; Weimar; Thüringen; Weimar; 99423.0; 11.326458; 50.991487; 2024-07-10 06:49:00; 2024-07-10 07:04:00; 2024-07-10 06:51:00; 2024-07-10 07:06:00; 2.0; 2.0; NaN; on_time; on_time; 2
1; -1008819848758697010-2407130714-22; 6; Grafing Bahnhof|Kirchseeon|Eglharting|Zorneding|Baldham|Vaterstetten|Haar|Gronsdorf|München-Trudering|München-Berg am Laim|München Leuchtenbergring|München Ost|München Rosenheimer Platz|München Isartor|München Marienplatz|München Karlsplatz|München Hbf (tief)|München Hackerbrücke|München Donnersbergerbrücke|München Hirschgarten|München-Laim; 8004158; 2; München-Pasing; Bayern; München; 81241.0; 11.461872; 48.149852; 2024-07-13 07:59:00; 2024-07-13 08:01:00; 2024-07-13 08:02:00; 2024-07-13 08:03:00; 3.0; 2.0; NaN; on_time; on_time; 2
2; -1009540259073221553-2407142134-10; 6; Starnberg|Starnberg Nord|Gauting|Stockdorf|Planegg|Gräfelfing|Lochham|München-Westkreuz|München-Pasing; 8004151; 3; München-Laim Pbf; Bayern; München; 80639.0; 11.503669; 48.144371; 2024-07-14 21:59:00; 2024-07-14 22:00:00; 2024-07-14 22:01:00; 2024-07-14 22:02:00; 2.0; 2.0; Bauarbeiten; on_time; on_time; 2
3; -1010076636343338093-2407101633-11; 6; Köln-Worringen|Köln-Blumenberg|Köln-Chorweiler Nord|Köln-Chorweiler|Köln Volkhovener Weg|Köln-Longerich|Köln Geldernstr./Parkgürtel|Köln-Nippes|Köln Hansaring|Köln Hbf; 8003368; 1; Köln Messe/Deutz; Nordrhein-Westfalen; Köln; 50679.0; 6.975001; 50.940874; 2024-07-10 16:59:00; 2024-07-10 17:00:00; 2024-07-10 16:59:00; 2024-07-10 17:00:00; 0.0; 0.0; Information. (Quelle: zuginfo.nrw); on_time; on_time; 2
4; -1012813851155274121-2407111424-7; 18; Maastricht|Meerssen|Valkenburg(NL)|Heerlen|Landgraaf|Eygelshoven Markt; 8002806; 3; Herzogenrath; Nordrhein-Westfalen; Herzogenrath; 52134.0; 6.094486; 50.870916; 2024-07-11 14:59:00; 2024-07-11 15:00:00; NaN; NaN; 0.0; 0.0; NaN; on_time; on_time; 2
5; -1012813851155274121-2407141424-7; 18; Maastricht|Meerssen|Valkenburg(NL)|Heerlen|Landgraaf|Eygelshoven Markt; 8002806; 3; Herzogenrath; Nordrhein-Westfalen; Herzogenrath; 52134.0; 6.094486; 50.870916; 2024-07-14 14:59:00; 2024-07-14 15:00:00; NaN; NaN; 0.0; 0.0; NaN; on_time; on_time; 2
6; -1014485518442214187-2407080436-7; 46; Ilmenau|Ilmenau Pörlitzer Höhe|Ilmenau-Roda|Elgersburg|Geraberg|Martinroda; 8010274; 5; Plaue (Thür); Thüringen; Plaue; 99338.0; 10.908698; 50.778393; 2024-07-08 04:57:00; 2024-07-08 05:06:00; 2024-07-08 04:57:00; 2024-07-08 05:06:00; 0.0; 0.0; NaN; on_time; on_time; 2
7; -1014485518442214187-2407090436-7; 46; Ilmenau|Ilmenau Pörlitzer Höhe|Ilmenau-Roda|Elgersburg|Geraberg|Martinroda; 8010274; 5; Plaue (Thür); Thüringen; Plaue; 99338.0; 10.908698; 50.778393; 2024-07-09 04:57:00; 2024-07-09 05:06:00; 2024-07-09 04:57:00; 2024-07-09 05:06:00; 0.0; 0.0; NaN; on_time; on_time; 2
8; -1014485518442214187-2407100436-7; 46; Ilmenau|Ilmenau Pörlitzer Höhe|Ilmenau-Roda|Elgersburg|Geraberg|Martinroda; 8010274; 5; Plaue (Thür); Thüringen; Plaue; 99338.0; 10.908698; 50.778393; 2024-07-10 04:57:00; 2024-07-10 05:06:00; 2024-07-10 04:57:00; 2024-07-10 05:06:00; 0.0; 0.0; NaN; on_time; on_time; 2
9; -1014485518442214187-2407120436-7; 46; Ilmenau|Ilmenau Pörlitzer Höhe|Ilmenau-Roda|Elgersburg|Geraberg|Martinroda; 8010274; 5; Plaue (Thür); Thüringen; Plaue; 99338.0; 10.908698; 50.778393; 2024-07-12 04:57:00; 2024-07-12 05:06:00; 2024-07-12 04:57:00; 2024-07-12 05:06:00; 0.0; 0.0; NaN; on_time; on_time; 2
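
A table like this one can be built by counting fully identical rows. The sketch below does so with the standard library on a handful of made-up rows; the row contents are hypothetical, and whether the report's duplicate total counts every copy or only the surplus copies is not stated here, so both figures are computed.

```python
from collections import Counter

# Hypothetical rows (ID, arrival_plan, arrival_delay_m); not taken from the dataset.
rows = [
    ("A-7", "2024-07-08 04:57:00", 0.0),
    ("A-7", "2024-07-08 04:57:00", 0.0),
    ("B-3", "2024-07-08 05:10:00", 2.0),
    ("C-1", "2024-07-09 06:00:00", 0.0),
    ("C-1", "2024-07-09 06:00:00", 0.0),
    ("C-1", "2024-07-09 06:00:00", 0.0),
]

# Rows are duplicates only if every field matches, so the whole tuple is the key.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

all_copies = sum(duplicated.values())                   # every row in a duplicated group
surplus_copies = sum(n - 1 for n in duplicated.values())  # copies beyond the first

for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row, n)
print(all_copies, surplus_copies)
```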